the method of moments estimators are always strongly consistent, it is
expected that the feasibility probability should also increase to one.
However, some counterintuitive results are obtained. In some cases
1. INTRODUCTION


The method of moments (MM) plays an important role in the estimation of parameters in
parametric inference. The MM estimator has a long history, starting with the work of Fisher (1922).
One of the major advantages of the MM estimator is its simplicity in practice (Rao (1973)).
In many situations MM estimators can be obtained by solving simple equations, whereas other
estimators, such as maximum likelihood (ML) estimators, must be obtained by minimizing or
maximizing a certain function. Sometimes the ML estimator does not exist whereas the MM
estimator does. In certain situations the MM estimator may be used as the initial guess in a
numerical search procedure that computes the ML estimator. It is therefore quite important to
study the properties of MM estimators, at least for small samples. In most situations MM
estimators are consistent and behave reasonably well for large sample sizes, although they may not
be as "efficient" as ML estimators (see Fisher 1922). Another major disadvantage of MM
estimators, at least for small sample sizes, is that they may not be feasible. Here, infeasibility
means that the estimators fail to satisfy the restrictions that hold on the parameter space.
We formally define a feasible estimator as follows:

Definition. Let $X_1, X_2, \ldots, X_n$ be a random sample from the density function $f(x; \theta)$
with $\theta \in \Theta$, where $\Theta$ is a subset of Euclidean space. Let $\hat\theta_n$ be an
estimator of $\theta$. If $\hat\theta_n \in \Theta$, then we say that $\hat\theta_n$ is a feasible
estimator of $\theta$. If $\hat\theta_n$ is not a feasible estimator of $\theta$, then it is called an
infeasible estimator of $\theta$.

Observe that if $\hat\theta_n$ converges, either in probability or almost surely, and the limit is
also infeasible, then $\hat\theta_n$ is inconsistent as well as asymptotically biased. Therefore, it is
useful to study the probability of obtaining an infeasible MM estimator in any given situation, at
least for small sample sizes. We are not familiar with any literature on this topic except the very
recent paper of Dupuis (1996), although all textbooks on statistics mention MM estimators. In her
paper, Dupuis (1996) considers a very special case of the generalized Pareto distribution and presents
some simple programs to predict the probability of obtaining such infeasible estimators for large
sample sizes.

The main aim of this paper is first to consider some simple examples in which the probability of
an infeasible MM estimator can be computed explicitly, and then to consider some cases in which it
cannot. In the latter situations we recommend Monte Carlo simulation to compute the probability of
obtaining an infeasible MM estimator. It is known (Hosking and Wallis; 1987) that for the
generalized Pareto distribution the MM estimators behave better than the ML estimators for sample
sizes up to 500 for certain ranges of the parameters. So it is reasonable to use MM estimators for
small sample sizes in the case of the generalized Pareto distribution. Dupuis (1996) considers the
generalized Pareto distribution, but only for larger samples (sample sizes ranging from 500 to
10000), and obtains the probability of infeasible MM estimators. We also consider the generalized
Pareto distribution, but mainly for small sample sizes, and make some comments about the approach
of Dupuis (1996).

The rest of the paper is organized as follows. In Section 2, we consider the one-parameter and
two-parameter exponential distributions. In Section 3, we consider the uniform distribution, and
the generalized Pareto distribution is considered in Section 4. Some numerical experiments are
performed in Section 5. Finally, we draw conclusions in Section 6.


2.  EXPONENTIAL  DISTRIBUTION 


First we consider the one-parameter exponential distribution. Let $X_1, X_2, \ldots, X_n$ be a
random sample of size $n$ from the following distribution:

$$f(x) = \begin{cases} e^{-(x-\alpha)} & \text{if } x > \alpha, \\ 0 & \text{otherwise.} \end{cases}$$

Observe that in this situation the MM estimator of $\alpha$, say $\hat\alpha_n$, is $\bar X - 1$. Now
there are two kinds of feasibility questions about $\hat\alpha_n$. First, if we have prior information
that $\alpha \ge 0$ (for example, $X$ represents a lifetime), then $\hat\alpha_n$ is feasible if

$$X_i > \hat\alpha_n \ge 0 \iff \bar X \ge 1 \text{ and } X_i > \bar X - 1 \text{ for all } i = 1, 2, \ldots, n, \text{ or } X_{(1)} \ge \bar X - 1,$$

where $X_{(i)}$ denotes the $i$th order statistic. Otherwise, $\hat\alpha_n$ is not feasible. Therefore,

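$$\begin{aligned}
P[\hat\alpha_n \text{ is feasible}] &= P[\bar X \ge 1,\ X_{(1)} \ge \bar X - 1] \\
&= P\Big[n \le \sum_{i=1}^{n} X_i \le nX_{(1)} + n\Big] \\
&= P\Big[n(1 - X_{(1)}) \le \sum_{i=1}^{n} (X_{(i)} - X_{(1)}) \le n\Big] \\
&= n\int_{\alpha}^{1}\int_{n(1-x)}^{n} g(y)\,dy\; e^{-n(x-\alpha)}\,dx + n\int_{1}^{\infty}\int_{0}^{n} g(y)\,dy\; e^{-n(x-\alpha)}\,dx, \quad \text{if } \alpha < 1 \qquad (2.1) \\
&= n\int_{\alpha}^{\infty}\int_{0}^{n} g(y)\,dy\; e^{-n(x-\alpha)}\,dx, \quad \text{if } \alpha \ge 1, \qquad (2.2)
\end{aligned}$$

where

$$g(y) = \begin{cases} \dfrac{1}{\Gamma(n-1)}\, y^{n-2} e^{-y} & y > 0, \\ 0 & \text{otherwise.} \end{cases} \tag{2.3}$$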

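Using the fact that $\int_0^z g(y)\,dy = 1 - \sum_{i=0}^{n-2} e^{-z} z^i / i!$, (2.1) can be written as

$$\begin{aligned}
& n\int_{\alpha}^{1} \sum_{i=0}^{n-2}\Big( e^{-n(1-x)}\frac{n^i (1-x)^i}{i!} - e^{-n}\frac{n^i}{i!} \Big) e^{-n(x-\alpha)}\,dx + \Big(1 - \sum_{i=0}^{n-2} e^{-n}\frac{n^i}{i!}\Big) e^{-n(1-\alpha)} \\
&\quad = \sum_{i=0}^{n-1} e^{-n(1-\alpha)} \frac{(n(1-\alpha))^i}{i!} - \sum_{i=0}^{n-2} e^{-n}\frac{n^i}{i!}.
\end{aligned}$$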

Also, (2.2) can be simplified as

$$\int_{0}^{n} g(y)\,dy = 1 - \sum_{i=0}^{n-2} e^{-n}\frac{n^i}{i!}.$$


In the case where we have no information on $\alpha$,

$$P[\hat\alpha_n \text{ is feasible}] = P[\bar X - X_{(1)} \le 1] = P\Big[\sum_{i=1}^{n} (X_{(i)} - X_{(1)}) \le n\Big] = \int_{0}^{n} g(y)\,dy = 1 - \sum_{i=0}^{n-2} e^{-n}\frac{n^i}{i!}.$$
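The closed forms obtained from (2.1) and (2.2) are easy to evaluate numerically. The following Python sketch is only an illustration; the function names `poisson_cdf` and `feasible_prob_one_param` are hypothetical and do not come from the paper. It relies on the observation that each finite sum above is a Poisson distribution function, and it evaluates the terms on the log scale so that $e^{-n}$ does not underflow for large $n$.

```python
import math

def poisson_cdf(m, lam):
    """Return sum_{i=0}^{m} exp(-lam) * lam**i / i! for lam > 0 (0 if m < 0)."""
    return sum(math.exp(i * math.log(lam) - lam - math.lgamma(i + 1))
               for i in range(m + 1))

def feasible_prob_one_param(n, alpha=None):
    """Feasibility probability of the MM estimator X-bar - 1 (n >= 2).

    alpha=None means no prior restriction on alpha; otherwise the
    restriction alpha >= 0 is assumed and alpha is the true value.
    """
    tail = 1.0 - poisson_cdf(n - 2, n)      # simplified form of (2.2)
    if alpha is None or alpha >= 1:
        return tail
    lam = n * (1.0 - alpha)                 # closed form of (2.1), alpha < 1
    return poisson_cdf(n - 1, lam) - poisson_cdf(n - 2, n)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(feasible_prob_one_param(n, 0.25), 4),
              round(feasible_prob_one_param(n), 4))
```

Under these formulas the unrestricted probability equals the probability that a Poisson($n$) variable is at least $n-1$, which suggests it approaches 1/2 as $n$ grows.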

Next we consider the two-parameter exponential distribution. Let $X_1, X_2, \ldots, X_n$ be a random
sample of size $n$ from the following density function:

$$f(x) = \begin{cases} \dfrac{1}{\theta}\, e^{-(x-\alpha)/\theta} & x > \alpha,\ \theta > 0, \\ 0 & \text{otherwise.} \end{cases}$$


It can be easily seen that the MM estimators of $\alpha$ and $\theta$ are

$$\hat\alpha_n = \bar X - S \quad\text{and}\quad \hat\theta_n = S,$$

where $nS^2/(n-1)$ is the sample variance. Again, we may or may not have prior information that
$\alpha$ is positive. In the case $\alpha \ge 0$,

$$P[\hat\alpha_n \text{ is feasible}] = P[S \le \bar X \le X_{(1)} + S]. \tag{2.4}$$

If we have no restriction on $\alpha$, then

$$P[\hat\alpha_n \text{ is feasible}] = P[\bar X \le X_{(1)} + S]. \tag{2.5}$$


Observe that it is not easy to compute (2.4) and (2.5) explicitly, although (2.4) depends
only on $\alpha/\theta$ and (2.5) is independent of $\alpha$ and $\theta$. We perform some Monte Carlo
simulations to estimate (2.4) and (2.5) in Section 5.
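A Monte Carlo estimate of (2.4) and (2.5) can be sketched as follows. This is a minimal illustration in Python, not the authors' program (which used RAN2): the function name `simulate_two_param`, the seed handling, and the use of Python's built-in generator are assumptions. $S^2$ is computed with divisor $n$, which is one reading of the statement that $nS^2/(n-1)$ is the sample variance.

```python
import math
import random

def simulate_two_param(n, alpha, theta, reps=10000, seed=0):
    """Estimate (2.4) and (2.5); returns (p_restricted, p_unrestricted)."""
    rng = random.Random(seed)
    hits_24 = hits_25 = 0
    for _ in range(reps):
        # expovariate takes a rate, so 1/theta gives scale theta
        x = [alpha + rng.expovariate(1.0 / theta) for _ in range(n)]
        mean = sum(x) / n
        s = math.sqrt(sum((xi - mean) ** 2 for xi in x) / n)  # divisor n (assumption)
        ok_25 = mean <= min(x) + s          # event in (2.5)
        hits_25 += ok_25
        hits_24 += ok_25 and s <= mean      # event in (2.4) adds S <= X-bar
    return hits_24 / reps, hits_25 / reps

if __name__ == "__main__":
    print(simulate_two_param(10, 0.25, 1.0))
```

Because both events are evaluated on the same simulated samples, the estimate of (2.4) can never exceed the estimate of (2.5).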


3.  UNIFORM  DISTRIBUTION 

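Suppose $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from $U(0, \theta)$. Then observe that
the MM estimator of $\theta$ is $\hat\theta_n = 2\bar X$. Therefore,

$$\begin{aligned}
P[\hat\theta_n \text{ is feasible}] &= P[X_i \le 2\bar X \text{ for all } i = 1, 2, \ldots, n] = P[X_{(n)} \le 2\bar X] \\
&= 1 - P\Big[\sum_{i=1}^{n-1} X_{(i)} < \Big(\frac{n}{2} - 1\Big) X_{(n)}\Big] \\
&= 1 - \int_{0}^{\theta} \frac{n x^{n-1}}{\theta^{n}} \int_{0}^{(n/2 - 1)x} \frac{1}{x^{n-1}\,\Gamma(n-1)} \sum_{i=0}^{n-1} (-1)^i \binom{n-1}{i} (z - ix)_+^{\,n-2}\,dz\,dx \\
&= 1 - \sum_{i=0}^{\lfloor n/2 \rfloor - 1} \frac{(-1)^i}{\Gamma(i+1)\,\Gamma(n-i)} \Big(\frac{n}{2} - 1 - i\Big)^{n-1},
\end{aligned} \tag{3.1}$$

where $(z - ix)_+$ denotes $\max\{0, (z - ix)\}$. Observe that (3.1) does not depend on $\theta$ and it
is a decreasing function of $n$. Simple numerical computations in MAPLE show that (3.1) decreases
to 1/2 as $n$ tends to $\infty$.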
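The value of (3.1) can be checked numerically. The sketch below is illustrative only and assumes the reading of (3.1) as one minus the distribution function of a sum of $n-1$ independent $U(0,1)$ variables evaluated at $n/2 - 1$; the function name `uniform_feasible_prob` is hypothetical. Exact rational arithmetic is used because the alternating sum is numerically delicate in floating point for larger $n$.

```python
from fractions import Fraction
from math import factorial

def uniform_feasible_prob(n):
    """Exact value of (3.1) for a sample of size n >= 2 from U(0, theta)."""
    half_n = Fraction(n, 2)
    total = Fraction(0)
    for i in range(n):
        base = half_n - 1 - i
        if base <= 0:               # positive part: remaining terms vanish
            break
        # 1 / (Gamma(i+1) * Gamma(n-i)) = 1 / (i! * (n-i-1)!)
        coeff = Fraction((-1) ** i, factorial(i) * factorial(n - i - 1))
        total += coeff * base ** (n - 1)
    return 1 - total

if __name__ == "__main__":
    for n in (2, 3, 4, 10, 50, 100):
        print(n, float(uniform_feasible_prob(n)))
```

For $n = 2$ the estimator is always feasible, since $X_{(2)} \le X_1 + X_2$, and the function returns 1 in that case.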

Now suppose that $X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from $U(\theta - a, \theta + a)$,
where $a$ is known and $\theta$ is unknown. The corresponding density function is

$$f(x) = \begin{cases} (2a)^{-1} & \text{if } \theta - a \le x \le \theta + a, \\ 0 & \text{otherwise.} \end{cases} \tag{3.2}$$

Without loss of generality let us assume that $a = 1/2$. The MM estimator of $\theta$ is
$\hat\theta_n = \bar X$. It is interesting to note that the ML estimator is not unique in this case:
any value between $X_{(n)} - 1/2$ and $X_{(1)} + 1/2$ maximizes the likelihood function, whereas the
MM estimator is unique.

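$$\begin{aligned}
P[\hat\theta_n \text{ is feasible}] &= P\Big[\bar X - \frac{1}{2} \le X_{(1)} \le X_{(n)} \le \bar X + \frac{1}{2}\Big] \\
&= P\Big[(n-1)X_{(n)} - X_{(1)} - \frac{n}{2} \le \sum_{i=2}^{n-1} X_{(i)} \le (n-1)X_{(1)} - X_{(n)} + \frac{n}{2}\Big].
\end{aligned} \tag{3.3}$$

Observe that the conditional distribution of $\sum_{i=2}^{n-1} X_{(i)}$ given $X_{(1)}$ and $X_{(n)}$ is
the same as the distribution of the sum of a random sample of size $(n-2)$ from $U(X_{(1)}, X_{(n)})$.
Therefore, (3.3) can be evaluated explicitly, but the actual expression becomes very messy.
However, it is clear that it does not depend on $\theta$. We have performed simulations to estimate
(3.3) in Section 5.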
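Since (3.3) is awkward to evaluate in closed form, a simulation is the practical route. The sketch below is a minimal illustration with $a = 1/2$; the function name `simulate_uniform_centered`, the default $\theta$, and the random number generator are assumptions made here, not details taken from the paper.

```python
import random

def simulate_uniform_centered(n, theta=0.5, reps=10000, seed=0):
    """Estimate (3.3) for U(theta - 1/2, theta + 1/2) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.uniform(theta - 0.5, theta + 0.5) for _ in range(n)]
        mean = sum(x) / n
        # feasible iff every observation lies in [mean - 1/2, mean + 1/2]
        hits += (mean - 0.5 <= min(x)) and (max(x) <= mean + 0.5)
    return hits / reps

if __name__ == "__main__":
    for n in (10, 15, 50, 100):
        print(n, simulate_uniform_centered(n))
```

The check is written directly in terms of $X_{(1)}$ and $X_{(n)}$, which is equivalent to the first line of (3.3) and avoids the rearranged sum.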

4.  GENERALIZED  PARETO  DISTRIBUTION 
In this section we consider some feasibility problems concerning the generalized Pareto (GP)
distribution. The GP distribution was introduced by Pickands (1975) and it has applications in a
number of fields, including reliability studies and the analysis of environmental extreme events.
It is a two-parameter distribution and it contains the uniform, exponential and Pareto
distributions as special cases.

Let the random variable $X$ follow a GP distribution; then $X$ has the density function

$$f(x) = \begin{cases} \dfrac{1}{\alpha}\Big(1 - \dfrac{kx}{\alpha}\Big)^{1/k - 1} & \text{if } k \ne 0, \\[2ex] \dfrac{1}{\alpha}\exp(-x/\alpha) & \text{if } k = 0. \end{cases} \tag{4.1}$$

The range of $x$ is $0 \le x < \infty$ for $k \le 0$ and $0 \le x \le \alpha/k$ for $k > 0$. The parameters
of the distribution are $\alpha$, the scale parameter, and $k$, the shape parameter. The special cases
$k = 0$ and $k = 1$ yield the exponential distribution and the uniform distribution on $[0, \alpha]$,
respectively. For $k < 0$ the ordinary Pareto distribution is obtained.

It is important to observe that for $k < 0$, the $r$th moment of $X$ exists if $k > -1/r$ (see
Hosking and Wallis; 1987), and it can be easily seen that if $0 > k > -1/2$, then

$$E(X) = \frac{\alpha}{1+k} \quad\text{and}\quad \mathrm{Var}(X) = \frac{\alpha^2}{(1+k)^2 (1+2k)}. \tag{4.2}$$

On the other hand, if $k \ge 0$ then all the moments of $X$ exist, which can be easily seen as
follows for $k > 0$:

$$\begin{aligned}
E(X^m) &= \frac{1}{\alpha}\int_{0}^{\alpha/k} x^m \Big(1 - \frac{kx}{\alpha}\Big)^{1/k - 1} dx
= \frac{\alpha^m}{k^{m+1}} \int_{0}^{1} y^m (1-y)^{1/k - 1}\,dy \\
&= \frac{\alpha^m}{k^{m+1}}\, B\Big(m+1, \frac{1}{k}\Big)
= \frac{m!\,\alpha^m}{(km+1)(k(m-1)+1)\cdots(k+1)},
\end{aligned} \tag{4.3}$$

and for the exponential distribution ($k = 0$) all the moments exist. However, Hosking and Wallis
(1987, p. 340) commented that for $k \ge 1/2$, $X$ has infinite variance, which is not correct.
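As a quick consistency check, the product formula in (4.3) can be evaluated directly and compared with (4.2). The sketch below is illustrative; the function name `gp_moment` is hypothetical. Passing `Fraction` arguments keeps the arithmetic exact.

```python
from fractions import Fraction
from math import factorial

def gp_moment(m, alpha, k):
    """m-th raw moment from (4.3): m! alpha^m / prod_{j=1}^{m} (1 + j k)."""
    denom = 1
    for j in range(1, m + 1):
        denom *= 1 + j * k
    return factorial(m) * alpha ** m / denom

if __name__ == "__main__":
    a, k = Fraction(1), Fraction(1, 4)
    mean = gp_moment(1, a, k)
    var = gp_moment(2, a, k) - mean ** 2
    print(mean, var)   # compare with alpha/(1+k) and alpha^2/((1+k)^2 (1+2k))
```

For $k = 1$ the same formula gives $\alpha^m/(m+1)$, the $m$th moment of the uniform distribution on $[0, \alpha]$.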

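Therefore for $-1/2 < k < 0$, the MM estimators of $\alpha$ and $k$ exist and they are as follows
(Hosking and Wallis; 1987):

$$\hat\alpha_n = \frac{1}{2}\bar X\Big(\frac{\bar X^2}{S^2} + 1\Big), \qquad \hat k_n = \frac{1}{2}\Big(\frac{\bar X^2}{S^2} - 1\Big), \tag{4.4}$$

where $\bar X$ and $nS^2/(n-1)$ are the sample mean and variance respectively.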
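The estimators in (4.4) and a GP sampler can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the names `gp_sample` and `mm_estimates` are hypothetical, $S^2$ is taken with divisor $n$, and the sampler inverts the distribution function $F(x) = 1 - (1 - kx/\alpha)^{1/k}$ implied by (4.1).

```python
import math
import random

def gp_sample(n, alpha, k, rng):
    """Draw n GP(alpha, k) variates by inverse-transform sampling."""
    out = []
    for _ in range(n):
        u = 1.0 - rng.random()              # u in (0, 1], avoids log(0)
        if k == 0:
            out.append(-alpha * math.log(u))
        else:
            out.append(alpha * (1.0 - u ** k) / k)
    return out

def mm_estimates(x):
    """Return (alpha_hat, k_hat) from (4.4), with S^2 using divisor n (assumption)."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((xi - mean) ** 2 for xi in x) / n
    ratio = mean * mean / s2
    return 0.5 * mean * (ratio + 1.0), 0.5 * (ratio - 1.0)

if __name__ == "__main__":
    rng = random.Random(0)
    print(mm_estimates(gp_sample(1000, 1.0, 0.2, rng)))
```

The estimators depend on the data only through $\bar X$ and $S^2$, so feasibility has to be checked separately against the observed range of the sample.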

For $k > 0$, the MM estimators exist and they are the same as defined in (4.4). It is important
to note that for the GP distribution the ML estimators do not exist for $k > 1$ (Grimshaw (1993)).
This is because for $k > 1$ the likelihood function tends to $\infty$ as $\alpha/k$ tends to $X_{(n)}$
from above. However, the MM estimators do exist in this case.

Now we would like to consider the feasibility of the MM estimators in two different
situations, namely for $k > 0$ and for $-1/2 < k < 0$. This is because for $k = 0$, the MM estimator
of $\alpha$ is always feasible.

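Case I: $k > 0$.

$$\begin{aligned}
P[\hat k_n \text{ and } \hat\alpha_n \text{ are feasible}]
&= P[\hat k_n > 0 \text{ and } X_i \le \hat\alpha_n / \hat k_n \text{ for all } i = 1, 2, \ldots, n] \\
&= P[\hat k_n > 0 \text{ and } X_{(n)} \le \hat\alpha_n / \hat k_n] \\
&= P\Big[\frac{\bar X^2}{S^2} > 1 \text{ and } X_{(n)} \le \bar X\,\frac{\bar X^2/S^2 + 1}{\bar X^2/S^2 - 1}\Big] \\
&= P\Big[\frac{X_{(n)} - \bar X}{X_{(n)} + \bar X} \le (\mathrm{C.V.})^2 < 1\Big],
\end{aligned} \tag{4.5}$$

where C.V. denotes the coefficient of variation, $S/\bar X$.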

Case II: $-1/2 < k < 0$.

In this case $\hat\alpha_n$ is always feasible.

$$P[\hat k_n \text{ is feasible}] = P\Big[-\frac{1}{2} < \hat k_n < 0\Big] = P\Big[\frac{\bar X^2}{S^2} < 1\Big] = P[(\mathrm{C.V.})^2 > 1]. \tag{4.6}$$

In both cases (4.5) and (4.6) it is not easy to evaluate the probabilities explicitly. We propose
to use Monte Carlo simulation to estimate (4.5) and (4.6). However, observe that (4.6) converges
to 1 as $n$ tends to $\infty$ due to the strong law of large numbers.
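The events in (4.5) and (4.6) are simple to check on simulated data. The following sketch is a hedged illustration, not the original simulation program: the function name `gp_feasible_prob`, the seed handling, Python's built-in generator, and the divisor $n$ in $S^2$ are all assumptions made here.

```python
import random

def gp_feasible_prob(n, k, alpha=1.0, reps=10000, seed=0):
    """Monte Carlo estimate of (4.5) when k > 0 and of (4.6) when k < 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # inverse-transform sampling from GP(alpha, k), k != 0
        x = [alpha * (1.0 - (1.0 - rng.random()) ** k) / k for _ in range(n)]
        mean = sum(x) / n
        cv2 = sum((xi - mean) ** 2 for xi in x) / n / (mean * mean)  # (C.V.)^2
        if k > 0:
            x_max = max(x)
            # event in (4.5): ratio <= (C.V.)^2 < 1
            hits += (x_max - mean) / (x_max + mean) <= cv2 < 1.0
        else:
            # event in (4.6): (C.V.)^2 > 1
            hits += cv2 > 1.0
    return hits / reps

if __name__ == "__main__":
    print(gp_feasible_prob(10, -0.4), gp_feasible_prob(10, 0.1))
```

With 10,000 replications the standard error of each estimate is at most about 0.005, so differences in the third decimal place should not be over-interpreted.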

Dupuis (1996) considers the case $0 < k \le 1/2$, but unfortunately she does not consider the
feasibility of $\hat k_n$. Instead, she considers the feasibility of $\hat\alpha_n$ and $\hat k_n$ given
that $\hat k_n > 0$, which may not be correct. Observe that it is important to know the range of $k$
because the likelihood function, the range of the data vector and also the feasibility questions
are quite different in different ranges. For example, when $k = 0$ there is no question of
feasibility of the MM estimators. In fact, in our simulations it is observed that
$P[\hat k_n \text{ is infeasible}]$ may not be negligible for small positive $k$. Another point
regarding the work of Dupuis (1996) is that she makes the following statement: if
$X_1, X_2, \ldots, X_n$ is a random sample of size $n$ from the GP distribution, then

$$P[X_i \le \hat\alpha_n / \hat k_n \text{ for all } i = 1, 2, \ldots, n] \tag{4.7}$$

$$= \big(P[X \le \hat\alpha_n / \hat k_n]\big)^n. \tag{4.8}$$

Here $\hat\alpha_n$ and $\hat k_n$ are the same as defined in (4.4), and $X$ is GP with parameters
$\alpha$ and $k$. We feel that (4.7) does not imply (4.8), because the events in (4.7) are not
independent. Unfortunately, most of her analysis and approximations are based on (4.8).

5.  NUMERICAL  EXPERIMENTS 

In this section, we present some Monte Carlo simulation results to estimate (2.4), (2.5), (3.3),
(4.5) and (4.6), for which explicit expressions are not available. All these simulations are
performed on a Sun workstation using the random deviate generator RAN2 of Press et al. (1986). In
all cases we use RAN2 as the uniform random deviate generator and obtain the different
distributions through the appropriate transformation. All the results reported here are based on
10,000 replications. We use the sample sizes 10, 15, 50, 100, 500, and 1,000 in all the cases.

In the case of the two-parameter exponential distribution, to estimate the probability of a
feasible MM estimator when there is prior information on $\alpha$ (i.e. (2.4)) and when there is no
prior information on $\alpha$ (i.e. (2.5)), we consider $\alpha$ = .25, .50, 1.0, 2.0 and $\theta$ = 1.0
(without loss of generality). The results are reported in Table 1. It is observed that when we
have the restriction $\alpha \ge 0$, then for fixed $\alpha$ the probability of a feasible MM estimator
of $\alpha$ increases as $n$ increases, and it seems to be increasing to 1/2, as in the case of the
one-parameter exponential family. For fixed $\theta$, the probability of a feasible estimator of
$\alpha$ increases as $\alpha$ increases to 1 and after that (for $\alpha > 1$) it remains constant. We
also compute the results for $\alpha = 3$ and $\alpha = 4$; they are exactly the same as for
$\alpha = 2$, so we do not report them separately. As the sample size increases the probability
becomes independent of $\alpha$. When we do not have restrictions on $\alpha$, the feasibility
probability is independent of $\alpha$ and $\theta$, and it is the same probability as we obtain in
the previous case for $\alpha > 1$ (last row of Table 1).

In the case of the uniform distribution, to estimate the probability of a feasible MM estimator
(i.e. (3.3)) we consider $\theta = 1/2$ without loss of generality, because (3.3) is independent of
$\theta$. The results are reported in Table 2. From the table it is clear that as the sample size
increases the probability of a feasible MM estimator gradually decreases, and it seems to be
converging to zero. This may be due to the fact that the convergence of $X_{(1)}$ ($X_{(n)}$) to
$\theta - 1/2$ ($\theta + 1/2$) is much faster than the convergence of $\bar X$ to $\theta$.

In the case of the GP distribution, to estimate the probability of feasible MM estimators for
$-1/2 < k < 0$ (i.e. (4.6)) we consider $k$ = -.4, -.3, -.2, -.1, and for $k > 0$ (i.e. (4.5)) we take
$k$ = .1, .2, .3, .4, .5, .75, 1.0, 1.25, 1.50, with $\alpha = 1$ (without loss of generality) in both
cases. The results for $-1/2 < k < 0$ and for $k > 0$ are reported in Table 3 and Table 4
respectively.

In Table 3, when $-1/2 < k < 0$, it is observed that as the sample size increases the probability
increases to one. It is also observed that as $k$ approaches zero the probability decreases, and
this is quite prominent at least for small sample sizes. The results are quite different for
$k > 0$ in Table 4. Distinct features are observed for $0 < k < 1/2$ and for $k \ge 1/2$. For
$0 < k < 1/2$ it is observed that the probability increases with $n$, and it seems to be
increasing to one. On the other hand, if $k \ge 1/2$ the probability is quite erratic and no such
conclusion can be drawn.
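A short limit calculation, added here as an illustrative aside and not a result stated in the paper, may help in reading the two regimes of Table 4. It uses only the mean and variance formulas in (4.2), which (4.3) shows also hold for $k > 0$, together with the fact that $X_{(n)}$ converges to the upper endpoint $\alpha/k$ when $k > 0$:

```latex
\[
(\mathrm{C.V.})^2 \;\longrightarrow\; \frac{\operatorname{Var}(X)}{E(X)^2} = \frac{1}{1+2k},
\qquad
\frac{X_{(n)} - \bar X}{X_{(n)} + \bar X} \;\longrightarrow\;
\frac{\alpha/k - \alpha/(1+k)}{\alpha/k + \alpha/(1+k)} = \frac{1}{1+2k}.
\]
```

Both sides of the first inequality in (4.5) therefore share the same limit, so the feasibility probability is governed by how quickly each side approaches that limit and need not tend to one. This is only a heuristic, in the same spirit as the explanation offered for the uniform case, and it is not a proof.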

6.  CONCLUSIONS 

In this paper we consider the problem of feasibility of MM estimators when there are some
restrictions on the parameter space and/or the range of the data depends on the parameters. We
feel it is an important problem, although it has not received enough attention in the literature.
Since MM estimators are always consistent (by the strong law of large numbers), it is expected
that they will also be feasible. But we observe some counterintuitive results in some situations.
Simulation results indicate that the feasibility probability might even converge to zero. Since MM
estimators do not take into consideration the restrictions on the parameter space and/or the
dependence of the range on the parameters, this work clearly suggests that practitioners must
check the feasibility of MM estimators in such situations before using them.
Table 1

Two-parameter exponential (restrictions on $\alpha$), $\theta = 1$

alpha \ n     10       15       50       100      500      1000
.25          .3450    .3456    .4003    .4358    .4739    .4758
.50          .3914    .4022    .4262    .4455    .4739    .4758
1.0          .4049    .4118    .4275    .4457    .4739    .4758
2.0          .4060    .4122    .4275    .4457    .4739    .4758

Table 2

Uniform $[\theta - 1/2, \theta + 1/2]$

n             10       15       50       100      500      1000
Prob         .6998    .6093    .3701    .2665    .1203    .0876

Table 3

Generalized Pareto: $\alpha = 1$, $k < 0$

k \ n         10       15       50       100      500      1000
-.4          .5621    .6894    .9506    .9954    1.00     1.00
-.3          .5024    .6015    .8901    .9756    1.00     1.00
-.2          .4143    .5120    .7920    .9130    .9996    1.00
-.1          .3328    .4043    .6079    .7264    .9667    .9967

Table 4

Generalized Pareto: $k > 0$, $\alpha = 1$

k \ n         10       15       50       100      500      1000
.10          .7169    .7351    .8317    .8986    .9932    .9998
.20          .7469    .7901    .8974    .9461    .9808    .9865
.30          .7524    .7903    .8744    .8909    .9182    .9333
.40          .7375    .7722    .8088    .8143    .8261    .8275
.50          .7179    .7336    .7354    .7314    .7221    .7275
.75          .6390    .6443    .6145    .5924    .5669    .5539
1.0          .5777    .5600    .5378    .5275    .5134    .5118
1.25         .5187    .5255    .4997    .5016    .4964    .4974
1.50         .5038    .4865    .4886    .4786    .4923    .4961

